Shared in planta population and transcriptomic features of nonpathogenic members of endophytic phyllosphere microbiota

Significance Plants evolved in an environment colonized by a vast number of microbes, which collectively constitute the plant microbiota. The majority of microbiota taxa are nonpathogenic and may be beneficial to plants under certain ecological or environmental conditions. We conducted experiments to understand the features of long-term interactions of nonpathogenic microbiota members with plants. We found that a multiplication–death equilibrium explained the shared long-term static populations of nonpathogenic bacteria and that in planta bacterial transcriptomic signatures were characteristic of the stationary phase, a physiological state in which stress protection responses are induced. These results may have significant implications for understanding the bulk of “nonpathogenic” plant–microbiota interactions that occur in agricultural and natural ecosystems.


In vitro bacterial growth and antibiotics used
lists the bacterial strains used in this study. Escherichia coli strains were grown in LB (Lennox) medium at 37 °C, while all other strains were grown in either a modified LB medium (LM: 10 g L⁻¹ tryptone, 6 g L⁻¹ yeast extract, 1.5 g L⁻¹ KH₂PO₄, 0.6 g NaCl, and 0.4 g MgSO₄·7H₂O) or King's B medium at 30 °C.

Plant growth conditions
Arabidopsis plants were grown in a growth chamber under the following conditions: 12-hour day length, a photon flux of 80 µmol m⁻² s⁻¹, a temperature of 24.5 °C during the day and 23.0 °C during the night, and a relative humidity between 65% and 75%. Table S11 lists the plant material used for this study. For some experiments (Figure S2B, and for the in planta RNA-Seq samples of Achromobacter xylosoxidans and Pandoraea), the relative humidity was raised above 99% after bacterial inoculation. Seeds were stratified for 2 to 5 days before sowing. Plants were watered with one-half strength Hoagland's solution when needed. All plants were grown partially covered by a transparent plastic dome.

Bacterial population density quantification assays
Bacterial inoculum suspensions were prepared from cultures grown to stationary phase on LM or KB agar plates. Bacteria were resuspended in 0.25 mM MgCl₂ to an appropriate OD₆₀₀, after which they were infiltrated with a needleless syringe into the abaxial side of Arabidopsis leaves. To determine in planta population densities, leaf discs were punched from plants and ground in 250 µL of 10 mM MgCl₂ using a TissueLyser II (QIAGEN; 2 cycles of 30 seconds at 25 Hz) and 3-mm zirconium oxide beads (Glen Mills Inc.). Serial dilutions of the ground tissue were spotted onto LM plates with appropriate antibiotics and grown overnight. Colony-forming units (CFUs) per cm² were determined for each sample, while CFUs per mL were determined for each inoculum.
To determine endophytic bacterial populations, leaves were placed in a 0.825% sodium hypochlorite solution for 1 minute, and then washed twice in distilled H₂O for 1 minute each. Leaves were blotted dry, after which leaf discs were collected as described above.
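The back-calculation from colony counts to CFU per cm² (leaf samples) and CFU per mL (inoculum) described above can be sketched as follows. This is only an illustration: the 250 µL grinding volume comes from the text, whereas the spotted volume, the leaf-disc area, and the function names are assumptions introduced here and are not stated in the methods.

```python
def cfu_per_cm2(colonies, dilution_factor, spot_volume_ul,
                grind_volume_ul=250.0, disc_area_cm2=1.0):
    """Back-calculate CFU per cm^2 of leaf tissue from one counted dilution spot.

    colonies        -- colonies counted in the spot
    dilution_factor -- fold dilution of the spotted sample (e.g. 1000 for 10^-3)
    spot_volume_ul  -- volume spotted on the plate, in µL (assumed value)
    grind_volume_ul -- volume the tissue was ground in (250 µL in the text)
    disc_area_cm2   -- total leaf-disc area ground per sample (assumed value)
    """
    # CFU per µL of undiluted homogenate
    cfu_per_ul = colonies * dilution_factor / spot_volume_ul
    # Total CFU in the homogenate, normalized to sampled leaf area
    return cfu_per_ul * grind_volume_ul / disc_area_cm2


def cfu_per_ml(colonies, dilution_factor, spot_volume_ul):
    """Back-calculate CFU per mL of an inoculum from one counted dilution spot."""
    # 1000 µL per mL
    return colonies * dilution_factor * 1000.0 / spot_volume_ul


if __name__ == "__main__":
    # Hypothetical numbers: 12 colonies at a 10^-3 dilution, 10 µL spotted,
    # 0.5 cm^2 of leaf tissue ground in 250 µL.
    print(cfu_per_cm2(12, 1000, 10.0, grind_volume_ul=250.0, disc_area_cm2=0.5))
    print(cfu_per_ml(12, 1000, 10.0))
```

In practice the countable dilution differs between samples, so the dilution factor is passed per spot rather than fixed.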
As described before (1), the in planta transcriptome analysis requires an inoculum greater than 10⁹ CFU mL⁻¹; otherwise, plant RNA would dominate the sequencing reads. The slight decrease in population densities observed between 6 and 24 hours (Figures S4A, S4B, and S4C) could be caused by a strong initial PTI response due to the overabundance of MAMPs in the inoculum.
For in planta antibiotic treatments, 400 µg mL⁻¹ of a β-lactam antibiotic (carbenicillin or cefotaxime) was infiltrated into plants every day, starting at 1 day post-inoculation with the bacterial endophytes and continuing until the end of the experiment. H₂O was used as a mock control. To calculate the percentage of cells that attempted to divide, we divided the average CFU cm⁻² of plants treated with the β-lactam antibiotic by the average CFU cm⁻² of plants mock-treated with H₂O, multiplied this ratio by 100, and subtracted the result from 100.
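The percentage calculation described above is simple arithmetic; a minimal sketch is given below. The function name and the example CFU values are hypothetical and serve only to illustrate the formula, 100 − (mean treated CFU cm⁻² / mean mock CFU cm⁻²) × 100.

```python
from statistics import mean


def percent_attempting_division(treated_cfu_cm2, mock_cfu_cm2):
    """Percentage of cells that attempted to divide.

    treated_cfu_cm2 -- CFU cm^-2 values from β-lactam-treated plants
    mock_cfu_cm2    -- CFU cm^-2 values from H2O mock-treated plants
    """
    # Percentage of the mock population that survived the antibiotic
    surviving_pct = mean(treated_cfu_cm2) / mean(mock_cfu_cm2) * 100
    # Cells killed by a β-lactam are taken as cells that attempted to divide
    return 100 - surviving_pct


if __name__ == "__main__":
    # Hypothetical replicate counts, not data from the study
    treated = [20_000, 10_000, 30_000]
    mock = [1_000_000, 900_000, 1_100_000]
    print(f"{percent_attempting_division(treated, mock):.1f}% attempted to divide")
```

The calculation uses the means of the biological replicates, as stated in the text, rather than a per-replicate ratio.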
In the second step, two DNA fragments were PCR amplified using primer pairs AVL001 and AVL002, and AVL003 and AVL004 (Table S12): one amplicon included the promoter P14g and translational coupler BCD2 from plasmid pBG42 (6), while the other amplicon included the gene coding for the fluorescent protein mCerulean3 (7). These fragments were introduced into the previously generated intermediate plasmid (after cutting the plasmid with SpeI [New England Biolabs®, Inc.]) by Gibson assembly to generate the division reporter pUC18-mini-Tn7T-Gm::tetR(BD)-Ptet-mCitrine_mCerulean3-BCD2-P14g (Figure 3A).

Gene integration into the P. syringae genome
Integration into the Pst genome was performed by site-specific Tn7 transposition into the unique attTn7 site of Pst DC3000. For this, triparental mating was set up between the Pst recipient strain, the strain containing the transposase helper plasmid pTNS3 (8), and the E. coli RHO5 donor strain (9) (carrying plasmid pUC18-mini-Tn7T-Gm with the genes to be integrated). Three to six days after conjugation and selection on plates containing 1.5 µg mL⁻¹ of gentamicin, colony PCRs were set up to identify the putative transconjugants using three primer pairs: AVL005 and AVL006, AVL007 and AVL008, and AVL005 and AVL007 (Table S12).

Confocal microscopy
Images were taken using the Nikon A1Rsi confocal microscope with a 20X objective (numerical aperture of 0.7 and a pinhole of 1.2 airy units). The CFP and YFP channels were detected using gallium arsenide phosphide photomultiplier (PMT) detectors. For the CFP channel, an excitation of 443.6 nm and an emission between 467 and 502 nm were used. For the YFP channel, an excitation of 513.9 nm and an emission between 530 and 600 nm were used. Images were acquired at a gain and offset at which no signal was detected in the negative control (Arabidopsis plants infected with Pst DC3000 or Pst ∆hrcC∆CFA without the fluorescent division reporter).
To determine the distribution of the overlap in signal intensity between mCitrine and mCerulean in segmented regions of interest, we used ImageJ (version 2.3.0/1.53f).

RNA extraction
For and S7D), we initially tried using a high bacterial inoculum, similar to the inoculum that had been used in previous in planta transcriptomic studies (1,12). However, even though a faster hypersensitive response is observed at a high inoculum in Bu-22 plants (13), ETI restriction of bacterial multiplication is compromised (Figure S7E). As such, we decided to use a lower inoculum for evaluating gene expression in an incompatible interaction (Figure S7F). Samples for ETI-inducing Pst DC3000 were collected starting at 24 hours post-inoculation, once population density became static (Figure S7F). (14).

RNA-Sequencing and bioinformatic analysis
The read quality was evaluated with FASTQC (version 0.11.7) (15), while adapters and low-quality sequences were trimmed with Cutadapt (version 2.9) (16). Three different pipelines were used to calculate differential gene expression (DGE), as DGE varies depending on the models used for transcript estimation (17). In the first pipeline, reads were aligned to the reference genomes using HISAT2 (version 2.1.0) (18). Aligned reads were processed using SAMTools (version 1.9) (19), after which StringTie (version 2.1.3) (20) was used to calculate transcript frequency, and DGE was finally estimated using DESeq2 (average expression per treatment is shown in Datasets S1, S2, and S3) (21). In the second pipeline, reads were also aligned with HISAT2, but DGE was calculated directly using Cuffdiff (version 2.2.1) (17). In the third pipeline, reads were pseudo-aligned using Salmon (version 0.11.3 for Pst ∆hrcC∆CFA or 1.2.1 for the other two endophytes) (22), after which DGE was estimated with DESeq2. Genes called differentially expressed in at least one of the three analyses were used for subsequent analysis. A comparison of the three pipelines for each endophyte is shown in Tables S13, S14, and S15.

post-inoculation (hpi) into Col-0 plants, to identify sets of genes whose expression increased or decreased once phyllosphere bacteria were inside plants. The analysis used the log2-normalized expression data obtained from StringTie. GO pathway enrichment analysis used the BINGO application, and summarization of the results was done as described above (Tables S4, S5, and S6).
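The rule that a gene is retained when at least one of the three pipelines calls it differentially expressed amounts to taking the union of the three result sets. The sketch below illustrates this under stated assumptions: each pipeline's output is represented as a plain mapping from gene ID to FDR-adjusted p-value, the 0.05 cutoff is the one quoted in the figure legends for DESeq2 and Cuffdiff, and the gene IDs, values, and function names are hypothetical.

```python
from collections import Counter


def significant_genes(adjusted_p, alpha=0.05):
    """Genes whose FDR-adjusted p-value is below alpha in one pipeline."""
    return {gene for gene, q in adjusted_p.items() if q < alpha}


def degs_in_any_pipeline(*pipeline_results, alpha=0.05):
    """Union of the genes called differentially expressed by any pipeline."""
    combined = set()
    for adjusted_p in pipeline_results:
        combined |= significant_genes(adjusted_p, alpha)
    return combined


def pipeline_support(*pipeline_results, alpha=0.05):
    """Number of pipelines that call each gene, for comparing the pipelines."""
    counts = Counter()
    for adjusted_p in pipeline_results:
        counts.update(significant_genes(adjusted_p, alpha))
    return counts


if __name__ == "__main__":
    # Hypothetical adjusted p-values; gene IDs are placeholders
    hisat2_stringtie_deseq2 = {"gene_a": 0.001, "gene_b": 0.20, "gene_c": 0.04}
    hisat2_cuffdiff = {"gene_a": 0.010, "gene_b": 0.03, "gene_c": 0.30}
    salmon_deseq2 = {"gene_a": 0.002, "gene_b": 0.60, "gene_d": 0.90}

    results = (hisat2_stringtie_deseq2, hisat2_cuffdiff, salmon_deseq2)
    print(sorted(degs_in_any_pipeline(*results)))
    print(pipeline_support(*results))
```

Taking the union favors sensitivity over agreement between methods; the per-gene support count shows how many pipelines back each call.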
Orthologous gene groups between bacterial strains were identified using OrthoMCL (28). Each orthologous group may have more than one gene from each strain, and were identified as having an

Fig. 2B, but using a relative humidity of over 99% throughout the experiment. Pst ∆hrcC∆CFA was inoculated at 10⁷ CFU mL⁻¹, and 400 µg mL⁻¹ of carbenicillin was infiltrated daily starting at 24 hours post-inoculation. Over 98% of the population was killed over the course of 5 days. (C) In vitro effect of 400 µg mL⁻¹ of the β-lactam antibiotics carbenicillin or cefotaxime on stationary-phase Rhodococcus sp. 964 and Pandoraea sp. Col-0-28. The y-axis is in logarithmic scale. (D) In planta population density of Rhodococcus sp. 964 (inoculum: 2 × 10⁷ CFU mL⁻¹) after addition of 400 µg mL⁻¹ of carbenicillin. Over 87% of the population was killed after 5 days.
Individual biological repetitions for each treatment are shown as circles. Error bars indicate the standard error of the mean. Different letters indicate differences in means, as judged by a Tukey HSD test (p < 0.05). dpi indicates days post-inoculation.

[Figure panels: YFP channel, CFP channel, and composite; DC3000 Ptet-mCitrine_P14g-mCerulean3 + aTc, 2 dpi]

Comparisons for enrichment of biological processes were done between the inoculum (LM), 6 (t6), 24 (t24) and 168 (t168) hours post-inoculation into Arabidopsis leaves. An asterisk denotes that a biological process had less than 5 DEGs. False discovery rate adjusted p-values (q-values) calculated by BINGO for each comparison are shown.

[Table residue: 1.05E-02; Serine family amino acid catabolism*; Osmolyte (ectoine) biosynthesis*; Osmolyte (ectoine) biosynthesis*]

An image of the expression profile is shown below the list of processes, starting from the zero time point (the inoculum; the x-axis denotes time [not to scale], while the y-axis indicates gene expression). The time points used for the analysis were the inoculum, 6, 24, and 168 hours post-inoculation into Arabidopsis leaves. An asterisk denotes that a biological process had less than 5 genes. False discovery rate adjusted p-values (q-values) for biological process enrichment for each expression profile as calculated by BINGO are shown.

[Table residue: 8.31E-03]
Enriched Pst ∆hrcC∆CFA biological processes in Profile: Glutamine family amino acid metabolism
Enriched Achromobacter xylosoxidans Col-0-50 biological processes in Profile: 6.28E-03

[Gene clusters shown: N-acetylglutaminylglutamine amide; coronatine]

Gene expression is shown as the log2 fold-change values calculated by DESeq2 from the StringTie transcript estimates when comparing the inoculum (KB) with 6 (t6) and 168 (t168) hours post-inoculation (hpi) into Arabidopsis leaves, and 6 with 168 hpi. Genes that are up-regulated are highlighted in cyan, while those that are down-regulated are highlighted in magenta. A column indicating whether a gene was differentially regulated, as determined by DESeq2 or Cuffdiff (false discovery rate adjusted p < 0.05), is also shown. Gene clusters were identified as enriched using a hypergeometric enrichment test. The yersiniabactin gene cluster did not show differential expression, but is included as it is a cluster already known to be associated with secondary metabolite biosynthesis.
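A hypergeometric enrichment test of the kind named in this legend asks how likely it is to see at least k differentially expressed genes in a cluster of n genes, given that K of the N genes in the genome are differentially expressed. The sketch below computes that upper-tail probability with the standard library only. It is an illustration: the gene universe, the example counts, and the function name are assumptions, and the legend does not state how the test was parameterized in the study.

```python
from math import comb


def hypergeom_enrichment_p(k, cluster_size, total_degs, total_genes):
    """Upper-tail hypergeometric probability P(X >= k).

    k            -- differentially expressed genes observed in the cluster
    cluster_size -- number of genes in the cluster (n draws)
    total_degs   -- differentially expressed genes in the genome (K successes)
    total_genes  -- genes in the genome (population size N)
    """
    denominator = comb(total_genes, cluster_size)
    max_hits = min(cluster_size, total_degs)
    # Sum the probability mass for k, k+1, ..., max_hits;
    # math.comb returns 0 when a draw is impossible, so no extra guard is needed.
    numerator = sum(
        comb(total_degs, i) * comb(total_genes - total_degs, cluster_size - i)
        for i in range(k, max_hits + 1)
    )
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical counts: 9 of 11 cluster genes are DEGs,
    # in a genome of 5,500 genes with 800 DEGs.
    print(hypergeom_enrichment_p(9, 11, 800, 5500))
```

When many clusters are tested, the resulting p-values would normally be corrected for multiple testing before a cluster is called enriched.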
Gene expression is shown as the log2 fold-change values calculated by DESeq2 from the StringTie transcript estimates when comparing the inoculum (LM) with 6 (t6) and 168 (t168) hours post-inoculation (hpi) into Arabidopsis leaves, and 6 with 168 hpi. Genes that are up-regulated are highlighted in cyan, while those that are down-regulated are highlighted in magenta. A column indicating whether a gene was differentially regulated, as determined by DESeq2 or Cuffdiff (false discovery rate adjusted p < 0.05), is also shown. Gene clusters were identified as enriched using a hypergeometric enrichment test.

Table S9. Enrichment of plant-associated genes that were up-regulated within the differentially expressed genes in the transcriptome analysis.
Escherichia coli DH5α
E. coli strain used for cloning of most constructs.

E. coli DH5α pUC18-mini-Tn7T-Gm
Strain used for integration of genes into the attTn7 site of Pseudomonas.

E. coli MaH1
Strain is a derivative of E. coli DH5α with attTn7::pir116 integrated into the genome for expression of the π protein, which is necessary for R6K plasmid replication.

E. coli PIR2 pBG42
Strain used for integration of msfGFP into the attTn7 site of Pseudomonas.

E. coli RHO5 pUC18-mini-Tn7T-Gm::tetR(BD)-Ptet-mCitrine_mCerulean3-BCD2-P14g
Strain used for integration of a dual fluorescent reporter for cell division into the attTn7 site of Pseudomonas. mCitrine is under the control of a tetracycline-inducible promoter, while mCerulean3 is under the control of the constitutive promoter P14g and the translational enhancer BCD2.

Pst DC3000 attTn7::tetR(BD)-Ptet-mCitrine_mCerulean-BCD2-P14g
Pst DC3000 strain carrying the cell division reporter: mCitrine expressed under the control of a tetracycline-inducible promoter and mCerulean3 under the control of the constitutive promoter P14g and the translational enhancer BCD2.

Pst ∆hrcC∆CFA
Non-pathogenic Pseudomonas syringae strain with the coronafacic acid cluster (CFA; 11 genes are deleted) and a gene necessary for type III secretion system formation (hrcC) deleted.

Pst ∆hrcC∆CFA attTn7T::tetR(BD)-Ptet-mCitrine_mCerulean-BCD2-P14g
Non-pathogenic Pst ∆hrcC∆CFA strain carrying the cell division reporter: mCitrine expressed under the control of a tetracycline-inducible promoter and mCerulean3 under the control of the constitutive promoter P14g and the translational enhancer BCD2.

Plant genotypes Comments References
A. thaliana Bu-22 Carries RPS7, a resistance locus that recognizes the AvrPto effector.